I'm new to Airflow. I was able to follow a video and create the docker-compose yml file, Dockerfile, and a dag file. I am able to view my dag and run it. In my script, I'm trying to open a text file (.txt), but I get the following error: FileNotFoundError: [Errno 2] No such file or directory.
I have the text file in the correct location, and the script runs in my local Python environment. I don't know why it raises this error when I run it in Airflow.
My docker-compose.yml, Dockerfile, and dag files are shown below. I'd appreciate any sort of help! Thank you!
docker-compose.yml
version: '3.7'
services:
  postgres:
    image: postgres:9.6
    environment:
      - POSTGRES_USER=airflow
      - POSTGRES_PASSWORD=airflow
      - POSTGRES_DB=airflow
    logging:
      options:
        max-size: 10m
        max-file: "3"
  webserver:
    build: ./dockerfiles
    restart: always
    depends_on:
      - postgres
    environment:
      - LOAD_EX=n
      - EXECUTOR=Local
    logging:
      options:
        max-size: 10m
        max-file: "3"
    volumes:
      - ./dags:/usr/local/airflow/dags
      # - ./plugins:/usr/local/airflow/plugins
    ports:
      - "8080:8080"
    command: webserver
    healthcheck:
      test: ["CMD-SHELL", "[ -f /usr/local/airflow/airflow-webserver.pid ]"]
      interval: 30s
      timeout: 30s
      retries: 3
Dockerfile
FROM puckel/docker-airflow:1.10.9
RUN pip install requests
RUN pip install bs4
RUN pip install pandas
RUN pip install xlrd
RUN pip install openpyxl
dag file
try:
    from datetime import timedelta
    from airflow import DAG
    from airflow.operators.python_operator import PythonOperator
    from datetime import datetime
    import requests
    from bs4 import BeautifulSoup
    import pandas as pd
    import smtplib
    from email.message import EmailMessage
    import os
    import sys
    import xlrd
    from datetime import datetime
    from openpyxl import load_workbook
    print("All Dag modules are ok.........")
except Exception as e:
    print("Error {} ".format(e))

def craigslist_search_function():
    ***PYTHON CODE***

with DAG(
    dag_id="craigslist_dag",
    schedule_interval="*/30 * * * *",
    default_args={
        "owner": "airflow",
        "retries": 1,
        "retry_delay": timedelta(minutes=5),
        "start_date": datetime(2022, 1, 1),
    },
    catchup=False) as f:
    craigslist_search_function = PythonOperator(
        task_id="craigslist_search_function",
        python_callable=craigslist_search_function)
I was expecting it to run the script with no issues. The script works perfectly fine in my local python environment. I don't know why it does not work in Airflow.
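One thing that may be worth checking, given the setup above: a relative path passed to `open()` is resolved against the working directory of the process running the task inside the container, not against the folder that holds the dag file, and the compose file only mounts `./dags` into `/usr/local/airflow/dags`. The sketch below is a minimal illustration of building the path from the dag file's own location instead. It assumes the text file sits next to the dag file inside `./dags`; the file name `search_terms.txt` and both helper functions are hypothetical and not taken from the actual script.

```python
from pathlib import Path

# Hypothetical file name: the post does not say what the text file is called.
DATA_FILE_NAME = "search_terms.txt"


def resolve_next_to(module_file, file_name):
    """Return an absolute path to file_name in the same directory as module_file."""
    return Path(module_file).resolve().parent / file_name


def read_lines(path):
    """Read the non-empty lines of a text file.

    If the file is missing, the error names the absolute path that was tried
    and the current working directory, which makes the mismatch visible in
    the Airflow task log.
    """
    if not path.is_file():
        raise FileNotFoundError(
            "Expected data file at {} (cwd is {})".format(path, Path.cwd())
        )
    with path.open(encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]
```

Inside the dag file this would be called as `read_lines(resolve_next_to(__file__, DATA_FILE_NAME))`. If the text file lives outside `./dags` on the host, it would not exist in the container at all under this compose file, so it would have to be moved into `./dags` or mounted through an additional volume entry.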